CA An overview of the major issues and initiatives in digital preservation at the Library of Congress. "In the medium term, the National Digital Library Program is focusing on two operational approaches. First, steps are taken during conversion that are likely to make migration or emulation less costly when they are needed. Second, the bit streams generated by the conversion process are kept alive through replication and routine refreshing supported by integrity checks. The practices described here provide examples of how those steps are implemented to keep the content of American Memory alive."
Phrases
<P1> The practices described here should not be seen as policies of the Library of Congress; nor are they suggested as best practices in any absolute sense. NDLP regards them as appropriate practices based on real experience, the nature and content of the originals, the primary purposes of the digitization, the state of technology, the availability of resources, the scale of the American Memory digital collection, and the goals of the program. They cover not just the storage of content and associated metadata, but also aspects of initial capture and quality review that support the long-term retention of content digitized from analog sources.
<P2> The Library recognizes that digital information resources, whether born digital or converted from analog forms, should be acquired, used, and served alongside traditional resources in the same format or subject area. Such responsibility will include ensuring that effective access is maintained to the digital content through American Memory and via the Library's main catalog and, in coordination with the units responsible for the technical infrastructure, planning migration to new technology when needed.
<P3> Refreshing can be carried out in a largely automated fashion on an ongoing basis. Migration, however, will require substantial resources, in a combination of processing time, out-sourced contracts, and staff time. Choice of appropriate formats for digital masters will defer the need for large-scale migration. Integrity checks and appropriate capture of metadata during the initial capture and production process will reduce the resource requirements for future migration steps. <warrant> We can be certain that migration of content to new data formats will be necessary at some point. The future will see industrywide adoption of new data formats with functional advantages over current standards. However, it will be difficult to predict exactly which metadata will be useful to support migration, when migration of master formats will be needed, and the nature and extent of resource needs. Human experts will need to decide when to undertake migration and develop tools for each migration step.
<P4> Effective preservation of resources in digital form requires (a) attention early in the life-cycle, at the moment of creation, publication, or acquisition and (b) ongoing management (with attendant costs) to ensure continuing usability.
<P5> The National Digital Library Program has identified several categories of metadata needed to support access and management for digital content. Descriptive metadata supports discovery through search and browse functions. Structural metadata supports presentation of complex objects by representing relationships between components, such as sequences of images. In addition, administrative metadata is needed to support management tasks, such as access control, archiving, and migration. Individual metadata elements may support more than one function, but the categorization of elements by function has proved useful.
<P6> It has been recognized that metadata representations appropriate for manipulation and long-term retention may not always be appropriate for real-time delivery.
<P7> It has also been realized that some basic descriptive metadata (at the very least a title or brief description) should be associated with the structural and administrative metadata.
<P8> During 1999, an internal working group reviewed past experience and prototype exercises and compiled a core set of metadata elements that will serve the different functions identified. This set will be tested and refined as part of pilot activities during 2000.
<P9> Master formats are well documented and widely deployed, preferably formal standards and preferably non-proprietary. Such choices should minimize the need for future migration or ensure that appropriate and affordable tools for migration will be developed by the industry. <warrant>
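Annotator's illustration: the excerpts above describe keeping bit streams alive through "replication and routine refreshing supported by integrity checks" but do not say how the checks are performed. The sketch below shows one common way such a cycle could work, and is not a description of the Library's actual tooling. The choice of SHA-256, the manifest layout, and the function names (`sha256_of`, `build_manifest`, `refresh`) are all assumptions introduced for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a checksum for every file under root at capture time.

    Hypothetical manifest layout: relative POSIX path -> hex digest.
    """
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def refresh(source: Path, target: Path, manifest: dict[str, str]) -> list[str]:
    """Copy each manifest entry to fresh storage and verify the copy.

    Returns the relative paths whose copy does not match the checksum
    recorded at capture time (damage in the source or in the copy).
    """
    failures = []
    for rel, expected in manifest.items():
        dest = target / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source / rel, dest)
        if sha256_of(dest) != expected:
            failures.append(rel)
    return failures
```

In this sketch the manifest is built once, during initial capture, and every later refresh is checked against it, so silent damage is reported rather than propagated to the new copy. That matches the point in P3 that integrity information captured during production reduces the effort of later steps.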
Conclusions
RQ "Developing long-term strategies for preserving digital resources presents challenges associated with the uncertainties of technological change. There is currently little experience on which to base predictions of how often migration to new formats will be necessary or desirable or whether emulation will prove cost-effective for certain categories of resources. ... Technological advances, while sure to present new challenges, will also provide new solutions for preserving digital content."